latent space

Diffusion Modelで入力画像を愚直に計算すると次元が大きすぎて計算コスト高すぎ

一旦次元を落とす

@ai__pub: The key idea behind Stable Diffusion:

Training and computing a diffusion model on large 512 x 512 images is _incredibly_ slow and expensive.

Instead, let's do the computation on _embeddings_ of images, rather than on images themselves.

5/15

https://pbs.twimg.com/media/FasS_RlVEAEE1cQ.jpg

@ai__pub: The latent space representation z(x) has much smaller dimension than the image x.

This makes the _latent_ diffusion model much faster and more expressive than an ordinary diffusion model.

See dimensions from the SD paper:

7/15

https://pbs.twimg.com/media/FasTAQEVEAA2cvs.jpg